Smart Paper Notes

Placing (Historical) Facts on a Timeline: A Classification cum Coref Resolution Approach

Sayantan Adak , Altaf Ahmad , Aditya Basu , Animesh Mukherjee

Category: Natural Language Processing

2022-06-28

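A timeline provides one of the most effective ways to view the important historical facts that occurred over a period of time, presenting insights that may not be apparent from reading the equivalent information in text form. By leveraging generative adversarial learning for important-sentence classification, and by assimilating knowledge-based tags to improve the performance of event coreference resolution, we introduce a two-stage system for event timeline generation from multiple (historical) text documents. We demonstrate our results on two manually annotated historical text documents. Our results can be very helpful to historians, in advancing historical research, and in understanding the socio-political landscape of a country.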

translated by Google Translate

Detecting Severity of Diabetic Retinopathy from Fundus Images using Ensembled Transformers

Chandranath Adak , Tejas Karkera , Soumi Chattopadhyay , Muhammad Saqib

Category: Computer Vision | Artificial Intelligence

2023-01-03

Diabetic Retinopathy (DR) is considered one of the primary concerns due to its effect on vision loss among most people with diabetes globally. The severity of DR is mostly comprehended manually by ophthalmologists from fundus photography-based retina images. This paper deals with an automated understanding of the severity stages of DR. In the literature, researchers have focused on this automation using traditional machine learning-based algorithms and convolutional architectures. However, the past works hardly focused on essential parts of the retinal image to improve the model performance. In this paper, we adopt transformer-based learning models to capture the crucial features of retinal images to understand DR severity better. We work with ensembling image transformers, where we adopt four models, namely ViT (Vision Transformer), BEiT (Bidirectional Encoder representation for image Transformer), CaiT (Class-Attention in Image Transformers), and DeiT (Data efficient image Transformers), to infer the degree of DR severity from fundus photographs. For experiments, we used the publicly available APTOS-2019 blindness detection dataset, where the performances of the transformer-based models were quite encouraging.
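
One common way to combine several image classifiers is to average their per-class probabilities and pick the most likely class. The abstract does not state which combination rule the authors use, so the following is only a minimal sketch under that assumption. The function names and the example logits are hypothetical; the five classes stand for the five DR severity grades (0 to 4) of APTOS-2019.

```python
import math

def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(per_model_logits, weights=None):
    """Average class probabilities across models; return (label, probabilities).

    per_model_logits: one list of class logits per model (hypothetical inputs).
    weights: optional per-model weights; defaults to a uniform average.
    """
    n_models = len(per_model_logits)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(per_model_logits[0])
    avg = [0.0] * n_classes
    for w, logits in zip(weights, per_model_logits):
        for c, p in enumerate(softmax(logits)):
            avg[c] += w * p
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg

# Made-up logits for one fundus image from four models (e.g. ViT, BEiT, CaiT, DeiT).
logits = [
    [0.1, 0.3, 2.0, 0.2, 0.0],
    [0.0, 1.5, 1.4, 0.1, 0.1],
    [0.2, 0.1, 1.8, 0.9, 0.0],
    [0.0, 0.4, 2.2, 0.3, 0.1],
]
label, probs = ensemble_predict(logits)
print("predicted severity grade:", label)
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh an uncertain one; passing `weights` would allow stronger models to count more.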

Deep Learning for Unsupervised Anomaly Localization in Industrial Images: A Survey

Xian Tao , Xinyi Gong , Xin Zhang , Shaohua Yan , Chandranath Adak

Category: Computer Vision

2022-07-21

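Deep learning-based visual inspection has been highly successful with the help of supervised learning methods. In real industrial scenarios, however, the scarcity of defect samples, the cost of annotation, and the lack of prior knowledge about defects can render supervised methods ineffective. In recent years, unsupervised anomaly localization algorithms have become widely used in industrial inspection tasks. This paper aims to help researchers in the field by surveying recent achievements in deep learning-based unsupervised anomaly localization in industrial images. The survey reviews more than 120 significant publications covering different aspects of anomaly localization, mainly the concepts, challenges, taxonomies, benchmark datasets, and quantitative performance comparisons of the reviewed methods. In reviewing the achievements to date, the paper provides detailed predictions and analysis of several future research directions. The review offers detailed technical information for researchers who are interested in industrial anomaly localization and who wish to apply it to anomaly localization in other fields.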

Deep Analysis of Visual Product Reviews

Chandranath Adak , Soumi Chattopadhyay , Muhammad Saqib

Category: Computer Vision

2022-07-19

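With the proliferation of the e-commerce industry, analyzing customer feedback has become essential for service providers. Recently, customers have begun uploading images of purchased products along with their review scores. In this paper, we undertake the task of analyzing such visual reviews, which is a very new problem. In the past, researchers worked on analyzing textual feedback; here we take no help from textual reviews, which may be absent, because a recent trend shows that customers prefer to quickly upload visual feedback rather than type textual feedback. We propose a hierarchical architecture in which the higher-level model performs product categorization and the lower-level model predicts the review score from a customer-provided product image. We built a database by procuring real visual product reviews, which was quite challenging. Our architecture obtained promising results in extensive experiments on this database. The proposed hierarchical architecture attained a 57.48% performance improvement over the best comparable single-level architecture.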

Segmentation of Blood Vessels, Optic Disc Localization, Detection of Exudates and Diabetic Retinopathy Diagnosis from Digital Fundus Images

Soham Basu , Sayantan Mukherjee , Ankit Bhattacharya , Anindya Sen

Category: Computer Vision

2022-07-09

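Diabetic Retinopathy (DR) is a complication of long-standing, unchecked diabetes and one of the leading causes of blindness in the world. This paper focuses on improved and robust methods for extracting some of the features of DR, viz. blood vessels and exudates. Blood vessels are segmented using multiple morphological and thresholding operations. For the segmentation of exudates, k-means clustering and contour detection on the original images are used. Extensive noise reduction is performed to remove false positives from the results of the vessel segmentation algorithm. Optic disc localization using k-means clustering and template matching is also performed. Finally, the paper presents a Deep Convolutional Neural Network (DCNN) model with 14 convolutional layers and 2 fully connected layers for the automatic, binary diagnosis of DR. The vessel segmentation, optic disc localization, and DCNN achieve accuracies of 95.93%, 98.77%, and 75.73%, respectively. The source code and pre-trained model are available at https://github.com/sohambasu07/dr_2021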
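
To illustrate how k-means clustering can separate bright candidate regions such as exudates from the darker background, here is a minimal sketch on scalar pixel intensities. It is not the authors' pipeline (which also uses contour detection and works on real fundus images); the helper names, the choice of two clusters, and the tiny grayscale "image" are assumptions made for illustration.

```python
def kmeans_1d(values, k=2, iters=20):
    """Lloyd's algorithm on scalar intensities; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly across the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [v for v, l in zip(values, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, labels

def bright_mask(image, k=2):
    """Mark the pixels that fall in the brightest intensity cluster."""
    flat = [v for row in image for v in row]
    centers, labels = kmeans_1d(flat, k)
    bright = max(range(k), key=lambda j: centers[j])
    width = len(image[0])
    return [[labels[r * width + c] == bright for c in range(width)]
            for r in range(len(image))]

# Hypothetical 3x4 grayscale patch with three bright pixels.
image = [
    [30, 35, 40, 200],
    [32, 210, 220, 38],
    [31, 33, 36, 34],
]
mask = bright_mask(image)
```

On real images the mask would still need cleaning (for example, removing the optic disc, which is also bright), which is one reason the paper combines clustering with further steps.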

Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Jakob Hollenstein , Sayantan Auddy , Matteo Saveriano , Erwan Renaudo , Justus Piater

Category: Machine Learning | Artificial Intelligence

2022-06-08

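Many deep reinforcement learning algorithms rely on simple forms of exploration, such as the additive action noise often used in continuous control domains. Typically, the scaling factor of this action noise is chosen as a hyperparameter and kept constant during training. In this paper, we analyze how the learned policy is affected by the noise type, the noise scale, and the reduction schedule of the scaling factor. We consider the two most prominent types of action noise, Gaussian and Ornstein-Uhlenbeck noise, and perform a vast experimental campaign by systematically varying the noise type and scale parameter and by measuring variables of interest such as the expected return of the policy and the state-space coverage during exploration. For the latter, we propose a novel state-space coverage measure $\operatorname{X}_{\mathcal{U}\text{rel}}$ that is more robust to boundary artifacts than previously proposed measures. Larger noise scales generally increase state-space coverage. However, we found that increasing the state-space coverage by using a larger noise scale is often not beneficial. On the contrary, reducing the noise scale over the course of training reduces variance and generally improves learning performance. We conclude that the best noise type and scale are environment dependent and, based on our observations, derive heuristic rules to guide the choice of action noise as a starting point for further optimization.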
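
The two noise types and a decaying noise scale discussed above can be sketched as follows. This is an illustrative sketch, not the paper's experimental code: the class and function names, the Ornstein-Uhlenbeck parameters (`theta`, `dt`), the linear schedule, and the action bounds of [-1, 1] are all assumptions.

```python
import math
import random

class GaussianNoise:
    """Independent zero-mean Gaussian noise at every step."""
    def __init__(self, sigma, rng):
        self.sigma = sigma
        self.rng = rng

    def sample(self):
        return self.rng.gauss(0.0, self.sigma)

class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise from a discretized OU process:
    x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)."""
    def __init__(self, sigma, rng, theta=0.15, mu=0.0, dt=1e-2):
        self.sigma, self.rng = sigma, rng
        self.theta, self.mu, self.dt = theta, mu, dt
        self.x = mu

    def sample(self):
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x

def linear_scale(step, total_steps, start=1.0, end=0.1):
    """Linearly anneal the noise scaling factor from start to end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + frac * (end - start)

def noisy_action(policy_action, noise, scale, low=-1.0, high=1.0):
    """Add scaled noise to the policy's action and clip to the action bounds."""
    return min(max(policy_action + scale * noise.sample(), low), high)

rng = random.Random(0)
ou = OrnsteinUhlenbeckNoise(sigma=0.2, rng=rng)
# Pretend the policy always outputs 0.5; only the exploration noise varies.
actions = [noisy_action(0.5, ou, linear_scale(t, 1000)) for t in range(1000)]
```

Swapping `ou` for `GaussianNoise(sigma=0.2, rng=rng)` gives uncorrelated exploration with the same schedule, which makes it easy to compare the two noise types under otherwise identical settings.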

Mining the manifolds of deep generative models for multiple data-consistent solutions of ill-posed tomographic imaging problems

Sayantan Bhadra , Umberto Villa , Mark A. Anastasio

Category: Computer Vision

2022-02-10

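Tomographic imaging is, in general, an ill-posed inverse problem. Typically, a single regularized image estimate of the sought-after object is obtained from tomographic measurements. However, there may be multiple objects that are all consistent with the same measurement data. The ability to generate such alternate solutions is important because it may enable new assessments of imaging systems. In principle, this can be achieved by posterior sampling methods. In recent years, deep neural networks have been employed for posterior sampling with promising results. However, such methods are not yet ready for use in large-scale tomographic imaging applications. Empirical sampling methods, on the other hand, may be computationally feasible for large-scale imaging systems and can enable uncertainty quantification for practical applications. Empirical sampling involves solving a regularized inverse problem within a stochastic optimization framework to obtain alternate data-consistent solutions. In this work, a new empirical sampling method is proposed that computes multiple solutions of a tomographic inverse problem that are consistent with the same acquired measurement data. The method operates by repeatedly solving an optimization problem in the latent space of a style-based generative adversarial network (StyleGAN) and was inspired by the Photo Upsampling via Latent Space Exploration (PULSE) method developed for super-resolution tasks. The proposed method is demonstrated and analyzed through numerical studies involving two stylized tomographic imaging modalities. These studies establish the ability of the method to perform efficient empirical sampling and uncertainty quantification.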
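
The idea of obtaining several data-consistent solutions by repeatedly solving an optimization problem from different random latent starts can be shown on a toy problem. Everything in this sketch is an assumption made for illustration: a hand-written linear map stands in for StyleGAN, a single-measurement operator stands in for the tomographic system, and plain gradient descent replaces the regularized solver used in the paper.

```python
import random

def generator(z):
    """Toy stand-in for a generative model: 2-D latent -> 3-pixel 'image'."""
    return [z[0], z[1], 0.5 * (z[0] + z[1])]

def forward(x):
    """Toy imaging operator with a null space: one measurement of three pixels."""
    return x[0] + x[1]

def solve_once(y, rng, steps=200, lr=0.1):
    """Gradient descent on (forward(generator(z)) - y)^2 from a random latent."""
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    for _ in range(steps):
        residual = forward(generator(z)) - y
        # For this toy model the gradient w.r.t. z is 2 * residual * [1, 1].
        z = [z[0] - lr * 2.0 * residual, z[1] - lr * 2.0 * residual]
    return generator(z)

def empirical_samples(y, n_samples=5, seed=0):
    """Collect several solutions that all reproduce the same measurement y."""
    rng = random.Random(seed)
    return [solve_once(y, rng) for _ in range(n_samples)]

samples = empirical_samples(y=1.0)
for s in samples:
    print([round(v, 3) for v in s], "-> measurement", round(forward(s), 6))
```

Each run ends at a different image because the measurement leaves the difference between the first two pixels undetermined; the spread across `samples` is a crude picture of that uncertainty.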

A Novel Image Denoising Algorithm Using Concepts of Quantum Many-Body Theory

Sayantan Dutta , Adrian Basarab , Bertrand Georgeot , Denis Kouamé

Category: Computer Vision

2021-12-16

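Sparse representation of real-life images is a very effective approach in imaging applications such as denoising. In recent years, with the growth of computing power, data-driven strategies that exploit the redundancy within patches extracted from one or several images to increase sparsity have become more prominent. This paper presents a novel image denoising algorithm that exploits an image-dependent basis inspired by quantum many-body theory. Based on patch analysis, the similarity measures in a local image neighborhood are formalized through a term akin to interaction in quantum mechanics, which can efficiently preserve the local structures of real images. The versatile nature of this adaptive basis extends its scope of application to image-independent or image-dependent noise scenarios without any adjustment. We carry out a rigorous comparison with contemporary methods to demonstrate the denoising capability of the proposed algorithm regardless of image characteristics, noise statistics, and noise intensity. We illustrate the properties of the hyperparameters and their respective effects on denoising performance, together with automated rules for selecting their values in experimental setups where the ground truth is not available. Finally, we show the ability of our approach to handle practical image denoising problems such as medical ultrasound image despeckling.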

Normative Modeling using Multimodal Variational Autoencoders to Identify Abnormal Brain Structural Patterns in Alzheimer Disease

Sayantan Kumar , Philip Payne , Aristeidis Sotiras

Category: Machine Learning

2021-10-10

Normative modelling is an emerging method for understanding the underlying heterogeneity within brain disorders like Alzheimer Disease (AD) by quantifying how each patient deviates from the expected normative pattern learned from a healthy control distribution. Since AD is a multifactorial disease with more than one biological pathway, multimodal magnetic resonance imaging (MRI) neuroimaging data can provide complementary information about the disease heterogeneity. However, existing deep learning-based normative models on multimodal MRI data use unimodal autoencoders with a single encoder and decoder, which may fail to capture the relationship between brain measurements extracted from different MRI modalities. In this work, we propose a multi-modal variational autoencoder (mmVAE) based normative modelling framework that can capture the joint distribution between different modalities to identify abnormal brain structural patterns in AD. Our multi-modal framework takes as input Freesurfer-processed brain region volumes from T1-weighted (cortical and subcortical) and T2-weighted (hippocampal) scans of cognitively normal participants to learn the morphological characteristics of the healthy brain. The estimated normative model is then applied to AD patients to quantify the deviation in brain volumes and identify the abnormal brain structural patterns due to the effect of the different AD stages. Our experimental results show that modelling the joint distribution between the multiple MRI modalities generates deviation maps that are more sensitive to disease staging within AD, correlate better with patient cognition, and yield a higher number of brain regions with statistically significant deviations than a unimodal baseline model with all modalities concatenated as a single input.
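
The core idea of normative modelling, measuring how far each patient lies from a pattern learned on healthy controls, can be illustrated with simple per-region z-scores. This is a deliberately simplified stand-in and not the mmVAE framework described in the abstract: the region names, the volumes, the function names, and the 1.96 threshold are all hypothetical.

```python
import statistics

def fit_normative(controls):
    """Estimate per-region mean and standard deviation from healthy controls."""
    norm = {}
    for region in controls[0]:
        values = [subject[region] for subject in controls]
        norm[region] = (statistics.mean(values), statistics.stdev(values))
    return norm

def deviation_map(patient, norm):
    """Z-score of each region relative to the healthy-control distribution."""
    return {region: (patient[region] - mu) / sd
            for region, (mu, sd) in norm.items()}

def abnormal_regions(deviations, threshold=1.96):
    """Regions whose deviation exceeds the threshold in either direction."""
    return sorted(r for r, z in deviations.items() if abs(z) > threshold)

# Made-up regional volumes (arbitrary units) for five healthy controls.
controls = [
    {"hippocampus": 4.0, "amygdala": 1.50},
    {"hippocampus": 4.2, "amygdala": 1.60},
    {"hippocampus": 3.8, "amygdala": 1.40},
    {"hippocampus": 4.1, "amygdala": 1.55},
    {"hippocampus": 3.9, "amygdala": 1.45},
]
patient = {"hippocampus": 3.2, "amygdala": 1.52}

norm = fit_normative(controls)
deviations = deviation_map(patient, norm)
flagged = abnormal_regions(deviations)
```

A learned model such as a VAE replaces the per-region mean with a subject-specific reconstruction, so that deviations can account for relationships between regions and modalities rather than treating each region independently.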
